Abstract

We survey recent results on combinatorial optimization problems in which the objective function 
is the entropy of a discrete distribution. These include the minimum entropy set cover, minimum 
entropy orientation, and minimum entropy coloring problems. 

1 Introduction 

Set covering and graph coloring problems are undoubtedly among the most fundamental discrete op- 
timization problems, and countless variants of these have been studied in the last 30 years. We discuss 
several coloring and covering problems in which the objective function is a quantity of information 
expressed in bits. More precisely, the objective function is the Shannon entropy of a discrete proba- 
bility distribution defined by the solution. These problems are motivated by applications as diverse as 
computational biology, data compression, and sorting algorithms. 

Recall that the entropy of a discrete random variable $X$ with probability distribution $\{p_i\}$, where
$p_i := P[X = i]$, is defined as:

\[ H(X) = -\sum_i p_i \log p_i. \]

Here, logarithms are in base 2, thus entropies are measured in bits; and $0 \log 0 := 0$. From Shannon's
theorem [34], the entropy is the minimum average number of bits needed to transmit a random 
variable on an error-free communication channel. 
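As a small illustration of this definition, the following sketch computes the entropy in bits with the convention $0 \log 0 := 0$. The function name `entropy` is ours, and the example distribution is the one quoted in the caption of Figure 1.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits, using the convention 0 log 0 := 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Distribution {5/11, 4/11, 2/11} from the caption of Figure 1.
h = entropy([5 / 11, 4 / 11, 2 / 11])
print(round(h, 4))
```

A uniform distribution on two outcomes gives exactly one bit, and a distribution concentrated on a single outcome gives zero.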

We give an overview of the recent hardness and approximability results obtained by the authors
on three minimum entropy combinatorial optimization problems [5, 6]. We also provide new
approximability results on the minimum entropy orientation and minimum entropy coloring problems
(Theorems 3, 7, 8, and 9). Finally, we present a recent result that quantifies how well the entropy of a
perfect graph is approximated by its chromatic entropy, with an application to a sorting problem [9].

2 Minimum Entropy Set Cover 

In the well-known minimum set cover problem, we are given a ground set U and a collection S of 
subsets of U, and we ask what is the minimum number of subsets from S such that their union is U. 
A famous heuristic for this problem is the greedy algorithm: iteratively choose the subset covering the 
largest number of remaining elements. The greedy algorithm is known to approximate the minimum 
set cover problem within a $1 + \ln n$ factor, where $n := |U|$. It is also known that this is essentially
the best approximation ratio achievable by a polynomial-time algorithm, unless NP has slightly
superpolynomial time algorithms [14].

* A preliminary version of the work appeared in [8].

Figure 1: Instances of the minimum entropy combinatorial optimization problems studied in this paper,
together with feasible solutions. The resulting probability distribution for the given solutions is
{5/11, 4/11, 2/11}.

In the minimum entropy set cover problem, the cardinality measure is replaced by the entropy 
of a partition of U compatible with a given covering. The function to be minimized is the quantity 
of information contained in the random variable that assigns to an element of U chosen uniformly at 
random, the subset that covers it. This is illustrated in Figure 1(a). A formal definition of the minimum
entropy set cover problem is as follows [22].

instance: A ground set $U$ and a collection $\mathcal{S} = \{S_1, \ldots, S_k\}$ of subsets of $U$
solution: An assignment $\phi : U \to \{1, \ldots, k\}$ such that $x \in S_{\phi(x)}$ for all $x \in U$
objective: Minimize the entropy $-\sum_{i=1}^{k} p_i \log p_i$, where $p_i := |\phi^{-1}(i)|/|U|$

Intuitively, we seek a covering of U yielding part sizes that are either large or small, but somehow as 
nonuniform as possible. Also, an arbitrarily small entropy can be reached using an arbitrarily large 
number of subsets, making the problem quite distinct from the minimum set cover problem. 



Applications. The original paper from Halperin and Karp [22] was motivated by applications in
computational biology, namely haplotype reconstruction. In an abstract setting, we are given a collection
of objects (that is, the set $U$) and, for each object, a collection of classes to which it may belong (that
is, the collection of sets $S_i$ containing the object). Assuming that the objects are selected at random
from a larger population, we wish to assign to each object the most likely class it belongs to.

Consider an assignment $\phi$, and suppose $q_i$ is the probability that a random object actually belongs
to class $i$. We aim at maximizing the product of the probabilities for the solution $\phi$:

\[ \prod_{x \in U} q_{\phi(x)}. \]

Now note that if $\phi$ is an optimal solution (supposing we know the values $q_i$), then the value
$p_i := |\phi^{-1}(i)|/|U|$ is a maximum likelihood estimator of the actual probability $q_i$. Thus, since the
probabilities are unknown, we may replace $q_i$ by its estimated value:

\[ \prod_{x \in U} p_{\phi(x)} = \prod_{i=1}^{k} p_i^{|\phi^{-1}(i)|}. \qquad (1) \]
Figure 2: The haplotype phasing problem (simplified setting).

Maximizing this function is equivalent to minimizing the entropy $-\sum_{i=1}^{k} p_i \log p_i$, leading to the
minimum entropy set cover problem. In the haplotype phasing problem, objects and classes are genotypes
and haplotypes, respectively. In a simplified model, a genotype can be modeled as a string over the
alphabet {0, 1, ?}, and a haplotype as a binary string (see Figure 2). A haplotype explains or is compatible
with a genotype if it matches it on every non-? position, and we aim at finding the maximum likelihood
assignment of genotypes to compatible haplotypes. It is therefore a special case of the minimum
entropy set cover problem. Experimental results derived from this work were proposed by Bonizzoni
et al. [3], and Gusev, Măndoiu, and Paşaniuc [20].
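The compatibility relation of the simplified model can be sketched as follows. The helper names `is_compatible` and `explained_sets` are hypothetical, and the toy genotypes and haplotypes are made up for illustration; the resulting sets form the collection $\mathcal{S}$ of a minimum entropy set cover instance whose ground set is the list of genotypes.

```python
def is_compatible(genotype, haplotype):
    """True if the binary haplotype matches the genotype on every non-? position."""
    return len(genotype) == len(haplotype) and all(
        g == "?" or g == h for g, h in zip(genotype, haplotype)
    )


def explained_sets(genotypes, haplotypes):
    """Map each haplotype to the set of indices of the genotypes it explains."""
    return {
        h: {i for i, g in enumerate(genotypes) if is_compatible(g, h)}
        for h in haplotypes
    }


# Toy instance (illustrative data only).
sets = explained_sets(["0?", "?1", "11"], ["01", "11", "00"])
print(sets)
```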



Results. We proved that the greedy algorithm performs much better for the minimum entropy version 
of the set cover problem. Furthermore, we gave a complete characterization of the approximability of 
the problem under the P $\neq$ NP hypothesis.

Theorem 1 ([7]). The greedy algorithm approximates the minimum entropy set cover problem within an
additive error of $\log e$ ($\approx 1.4427$) bits. Moreover, for every $\varepsilon > 0$, it is NP-hard to approximate the problem
within an additive error of $\log e - \varepsilon$ bits.

Our analysis of the greedy algorithm for the minimum entropy set cover problem is an improvement,
both in terms of simplicity and approximation error, over the first analysis given by Halperin
and Karp [22]. The hardness of approximation is shown by adapting a proof from Feige, Lovász, and
Tetali on the minimum sum set cover problem [15], which itself derives from the results of Feige on
minimum set cover [14].
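For concreteness, here is a minimal sketch of the greedy algorithm applied to the minimum entropy set cover problem, together with the entropy of the assignment it produces. The function names and the small instance are ours; ties are broken by taking the first subset of maximum coverage, and the subsets are assumed to cover the ground set.

```python
import math


def greedy_assignment(universe, subsets):
    """Assign each element to the subset chosen by the greedy algorithm.

    Returns a dict mapping every element to the index of its subset.
    """
    uncovered = set(universe)
    assignment = {}
    while uncovered:
        best = max(range(len(subsets)), key=lambda i: len(subsets[i] & uncovered))
        newly_covered = subsets[best] & uncovered
        if not newly_covered:
            raise ValueError("the subsets do not cover the ground set")
        for x in newly_covered:
            assignment[x] = best
        uncovered -= newly_covered
    return assignment


def assignment_entropy(assignment):
    """Entropy (in bits) of the partition induced by an assignment."""
    n = len(assignment)
    sizes = {}
    for index in assignment.values():
        sizes[index] = sizes.get(index, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in sizes.values())


# Illustrative instance on 11 elements; greedy produces parts of sizes 5, 4, 2.
subsets = [{1, 2, 3, 4, 5}, {5, 6, 7, 8, 9}, {9, 10, 11}]
assignment = greedy_assignment(range(1, 12), subsets)
print(assignment_entropy(assignment))
```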

Analysis of the Greedy Algorithm by Dual Fitting. We present another proof that the greedy algorithm
approximates the minimum entropy set cover problem within an additive error of $\log e$. It differs
from the proof given in [7] in that it uses the dual fitting method.

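Let OPT denote the entropy of an optimum solution. Let $\mathcal{S}^*$ be the collection of subsets of sets in
$\mathcal{S}$, that is, $\mathcal{S}^* := \{S : \exists S' \in \mathcal{S},\ S \subseteq S'\}$. The following linear program gives a lower bound on OPT:

\[
\begin{array}{lll}
\min & \displaystyle \sum_{S \in \mathcal{S}^*} -\frac{|S|}{n} \log \frac{|S|}{n} \cdot x_S & \\
\text{s.t.} & \displaystyle \sum_{S \ni v} x_S = 1 & \forall v \in U \qquad (2) \\
 & x_S \geq 0 & \forall S \in \mathcal{S}^*
\end{array}
\]

Indeed, if we add the requirements that $x_S \in \{0, 1\}$ for every set $S$, we obtain a valid integer
programming formulation of the minimum entropy set cover problem.
The dual of (2) reads:

\[
\begin{array}{lll}
\max & \displaystyle \sum_{v \in U} y_v & \\
\text{s.t.} & \displaystyle \sum_{v \in S} y_v \leq -\frac{|S|}{n} \log \frac{|S|}{n} & \forall S \in \mathcal{S}^* \qquad (3)
\end{array}
\]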
In the case of the minimum entropy set cover problem, the greedy algorithm iteratively selects a set
$S \in \mathcal{S}$ that covers a maximum number of uncovered elements in $U$, and assigns the latter elements
to the set $S$. Let $\ell$ be the number of iterations performed by this algorithm on input $(U, \mathcal{S})$. For
$i \in \{1, \ldots, \ell\}$, let $S_i$ be the set of elements that are covered during the $i$th iteration. (Thus $S_i \in \mathcal{S}^*$.)
The entropy of the greedy solution is

\[ g := -\sum_{i=1}^{\ell} \frac{|S_i|}{n} \log \frac{|S_i|}{n}. \qquad (4) \]

Now, for every $v \in U$, let $y_v$ be defined as

\[ y_v := -\frac{1}{n} \log \frac{|S_i| \cdot e}{n}, \]

where $i$ is the (unique) index such that $v \in S_i$.

If the vector $y$ is feasible for (3), then we deduce that

\[ \mathrm{OPT} \geq \sum_{v \in U} y_v = -\sum_{i=1}^{\ell} \frac{|S_i|}{n} \log \frac{|S_i| \cdot e}{n} = g - \log e, \]

implying that the greedy algorithm approximates OPT within an additive constant of $\log e$. Hence, it
is enough to prove that $y$ is feasible for the dual, that is, $\sum_{v \in S} y_v \leq -\frac{|S|}{n} \log \frac{|S|}{n}$ for every $S \in \mathcal{S}^*$.

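Let $S \in \mathcal{S}^*$ and, for every $i \in \{1, \ldots, \ell\}$, let $a_i := |S \cap S_i|$. At the beginning of the $i$th iteration of
the greedy algorithm, all the elements in $S - \bigcup_{j=1}^{i-1} S_j$ are not yet covered. Since the algorithm could
cover all these elements at that iteration, we have $|S_i| \geq |S - \bigcup_{j=1}^{i-1} S_j| = |S| - \sum_{j=1}^{i-1} a_j$. This implies

\[ \prod_{i=1}^{\ell} |S_i|^{a_i} \geq \prod_{i=1}^{\ell} \Bigl( |S| - \sum_{j=1}^{i-1} a_j \Bigr)^{a_i}. \qquad (5) \]

Using that $\sum_{i=1}^{\ell} a_i = |S|$, the following inequality is easily seen to hold:

\[ \prod_{i=1}^{\ell} \Bigl( |S| - \sum_{j=1}^{i-1} a_j \Bigr)^{a_i} \geq |S|!. \qquad (6) \]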
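Combining (5) and (6), and using the lower bound $|S|! \geq (|S|/e)^{|S|}$, we obtain

\[
\begin{aligned}
\sum_{v \in S} y_v &= -\sum_{i=1}^{\ell} \frac{a_i}{n} \log \frac{|S_i| \cdot e}{n} \\
&= -\frac{1}{n} \sum_{i=1}^{\ell} a_i \log (|S_i| \cdot e) + \frac{|S|}{n} \log n \\
&= -\frac{1}{n} \log \prod_{i=1}^{\ell} (|S_i| \cdot e)^{a_i} + \frac{|S|}{n} \log n \\
&= -\frac{1}{n} \log \Bigl( e^{|S|} \prod_{i=1}^{\ell} |S_i|^{a_i} \Bigr) + \frac{|S|}{n} \log n \\
&\leq -\frac{1}{n} \log \bigl( e^{|S|} \, |S|! \bigr) + \frac{|S|}{n} \log n \\
&\leq -\frac{1}{n} \log |S|^{|S|} + \frac{|S|}{n} \log n \\
&= -\frac{|S|}{n} \log \frac{|S|}{n}.
\end{aligned}
\]

Therefore, $y$ is feasible for (3), as desired.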
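The dual fitting argument can be checked numerically on small instances. The sketch below (function names and the instance are ours) runs the greedy algorithm, builds the vector $y$ defined above, and verifies by brute force that every constraint of the dual holds; it enumerates all of $\mathcal{S}^*$ and is therefore only meant for tiny inputs. The subsets are assumed to cover the ground set.

```python
import math
from itertools import combinations


def greedy_parts(universe, subsets):
    """Return the parts S_1, ..., S_l covered at each greedy iteration."""
    uncovered, parts = set(universe), []
    while uncovered:
        best = max(subsets, key=lambda s: len(s & uncovered))
        if not best & uncovered:
            raise ValueError("the subsets do not cover the ground set")
        parts.append(best & uncovered)
        uncovered -= parts[-1]
    return parts


def dual_solution(parts, n):
    """y_v := -(1/n) log(|S_i| e / n) for v in S_i."""
    return {v: -math.log2(len(p) * math.e / n) / n for p in parts for v in p}


def is_dual_feasible(y, subsets, n, tol=1e-12):
    """Check sum_{v in S} y_v <= -(|S|/n) log(|S|/n) for every S in S*."""
    for s in subsets:
        for size in range(1, len(s) + 1):
            bound = -(size / n) * math.log2(size / n)
            for sub in combinations(sorted(s), size):
                if sum(y[v] for v in sub) > bound + tol:
                    return False
    return True


subsets = [{1, 2, 3, 4, 5}, {5, 6, 7, 8, 9}, {9, 10, 11}]
parts = greedy_parts(range(1, 12), subsets)
y = dual_solution(parts, 11)
print(is_dual_feasible(y, subsets, 11))
```

On this instance the dual objective $\sum_v y_v$ equals $g - \log e$, as in the derivation.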

3 Minimum Entropy Orientations (Vertex Cover) 

The minimum entropy orientation problem is the following [6]:

instance: An undirected graph $G = (V, E)$
solution: An orientation of $G$
objective: Minimize the entropy $-\sum_{v \in V} p_v \log p_v$, where $p_v := \rho(v)/|E|$, and $\rho(v)$ is the indegree of
vertex $v$ in the orientation

An instance of the minimum entropy orientation problem together with a feasible solution is given
in Figure 1(b). Note that the problem is a special case of minimum entropy set cover in which every
element in the ground set $U$ is contained in exactly two subsets. Thus it can be seen as a minimum
entropy vertex cover problem.



Results. We proved that the minimum entropy orientation problem is NP-hard. Let us denote by
OPT(G) the minimum entropy of an orientation of G (in bits). An orientation of G is said to be biased
if each edge vw with deg(v) > deg(w) is oriented towards v. Biased orientations have an entropy that
is provably closer to the minimum than those obtained via the greedy algorithm.

Theorem 2 ([6]). The entropy of any biased orientation of G is at most OPT(G) + 1 bits. It follows that
the minimum entropy orientation problem can be approximated within an additive error of 1 bit, in linear
time.
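A biased orientation is straightforward to compute. The following sketch (names ours) orients every edge towards its endpoint of larger degree and breaks ties towards the larger vertex label, which is one arbitrary but valid choice; vertex labels are assumed to be comparable.

```python
import math
from collections import Counter


def biased_orientation(edges):
    """Return arcs (tail, head), each edge oriented towards the higher-degree endpoint."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [
        (u, v) if (degree[v], v) >= (degree[u], u) else (v, u)
        for u, v in edges
    ]


def orientation_entropy(arcs):
    """Entropy (in bits) of the indegree distribution rho(v)/|E|."""
    m = len(arcs)
    indegree = Counter(head for _, head in arcs)
    return -sum(d / m * math.log2(d / m) for d in indegree.values())


star = biased_orientation([(0, 1), (0, 2), (0, 3)])
triangle = biased_orientation([(0, 1), (1, 2), (0, 2)])
print(orientation_entropy(star), orientation_entropy(triangle))
```

On the star, all edges point to the center and the entropy is zero; on the triangle, the indegrees are 0, 1, and 2.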



Constant-time Approximation Algorithm for Bounded Degree Graphs. By making use of the fact
that the computation of a biased orientation is purely local [29], we show that we can randomly
sample such an approximate solution to guess OPT(G) within an additive error of $1 + \varepsilon$ bits. The
complexity of the resulting algorithm does not (directly) depend on $n$, but only on $\Delta$ and $1/\varepsilon$. This is
a straightforward application of ideas presented by Parnas and Ron [32].

We consider a graph $G$ with $n$ vertices, $m$ edges, and maximum degree $\Delta$. Pick any preferred
biased orientation of $G$. (For instance, we may order the vertices of $G$ arbitrarily and orient an
edge $vw$ with $\deg(v) = \deg(w)$ towards the vertex that appears last in the ordering.) The following
algorithm returns an approximation of OPT(G) (below, $s$ is a parameter whose value will be decided
later).

1. For $i = 1$ to $s$

(a) pick vertex $v_i$ uniformly at random

(b) compute the indegree $\rho(v_i)$ of $v_i$ in the chosen biased orientation

2. return $H := \log m - \frac{n}{sm} \sum_{i=1}^{s} \rho(v_i) \log \rho(v_i)$

The worst-case complexity of the algorithm is $O(s\Delta^2)$.
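The sampling procedure can be sketched as follows, assuming the graph is given as a dictionary of adjacency lists and that ties between equal-degree endpoints are broken towards the larger label. The function name, the `seed` parameter, and the test graphs are our own choices. Note that the indegree of a sampled vertex only depends on the degrees of its neighbors.

```python
import math
import random


def estimate_orientation_entropy(adjacency, num_samples, seed=0):
    """Estimate the entropy of a biased orientation from sampled vertices."""
    rng = random.Random(seed)
    vertices = list(adjacency)
    n = len(vertices)
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2

    def rank(v):
        # An edge vw is oriented towards the endpoint of larger rank.
        return (len(adjacency[v]), v)

    total = 0.0
    for _ in range(num_samples):
        v = rng.choice(vertices)
        indegree = sum(1 for w in adjacency[v] if rank(w) < rank(v))
        if indegree > 1:  # rho log rho vanishes for rho in {0, 1}
            total += indegree * math.log2(indegree)
    return math.log2(m) - n / (num_samples * m) * total


matching = {0: [1], 1: [0], 2: [3], 3: [2], 4: [5], 5: [4], 6: [7], 7: [6]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(estimate_orientation_entropy(matching, 50))
print(estimate_orientation_entropy(star, 2000))
```

On a perfect matching every indegree is 0 or 1, so the estimate equals $\log m$ regardless of the sample; on the star the estimate concentrates around the true value 0.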

Theorem 3. There is an algorithm of worst-case complexity $O(\Delta^4 \log^2 \Delta / \varepsilon^2)$ that, when given a graph $G$
with maximum degree $\Delta$ and at least as many edges as vertices, returns a number $H$ satisfying, with high
probability,

\[ \mathrm{OPT}(G) \leq H \leq \mathrm{OPT}(G) + (1 + \varepsilon). \]
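Proof. Let $V := V(G)$ and OPT := OPT(G). From the previous theorem we get:

\[ -\sum_{v \in V} \frac{\rho(v)}{m} \log \frac{\rho(v)}{m} = \log m - \frac{1}{m} \sum_{v \in V} \rho(v) \log \rho(v) \leq \mathrm{OPT} + 1. \]

For $i = 1, \ldots, s$, let $v_i$ denote a uniformly sampled vertex of $G$. By linearity of expectation, we have:

\[ E\Bigl[ \sum_{i=1}^{s} \rho(v_i) \log \rho(v_i) \Bigr] = \frac{s}{n} \sum_{v \in V} \rho(v) \log \rho(v). \]

Noting $0 \leq \rho(v_i) \log \rho(v_i) \leq \Delta \log \Delta$ (for all $i$), Hoeffding's inequality then implies:

\[
\begin{aligned}
& P\Bigl[ \Bigl| \sum_{i=1}^{s} \rho(v_i) \log \rho(v_i) - E\Bigl[ \sum_{i=1}^{s} \rho(v_i) \log \rho(v_i) \Bigr] \Bigr| \geq \varepsilon s \Bigr]
  \leq 2 \exp\Bigl( -\frac{2 s^2 \varepsilon^2}{s (\Delta \log \Delta)^2} \Bigr) \\
\Rightarrow\ & P\Bigl[ \Bigl| \frac{n}{sm} \sum_{i=1}^{s} \rho(v_i) \log \rho(v_i) - \frac{1}{m} \sum_{v \in V} \rho(v) \log \rho(v) \Bigr| \geq \varepsilon \frac{n}{m} \Bigr]
  \leq 2 \exp\Bigl( -\frac{2 s \varepsilon^2}{(\Delta \log \Delta)^2} \Bigr) \\
\Rightarrow\ & P\bigl[ |H - \mathrm{OPT}| > 1 + \varepsilon \bigr] \leq P\Bigl[ |H - \mathrm{OPT}| \geq 1 + \varepsilon \frac{n}{m} \Bigr]
  \leq 2 \exp\Bigl( -\frac{2 s \varepsilon^2}{(\Delta \log \Delta)^2} \Bigr).
\end{aligned}
\]

By letting $s = \Theta((\Delta \log \Delta)^2 / \varepsilon^2)$, we conclude that, with arbitrarily high probability, the above algorithm
provides an approximation of OPT within $1 + \varepsilon$ bits in time $O(s\Delta^2) = O(\Delta^4 \log^2 \Delta / \varepsilon^2)$. Note that
this approximation can be either an under- or an over-approximation. To make the approximation
one-sided, we can simply return $H + \varepsilon$. □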
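To make the choice of $s$ concrete, one can solve $2 \exp(-2 s \varepsilon^2 / (\Delta \log \Delta)^2) \leq \delta$ for $s$, where $\delta$ is a target failure probability. The parameter `delta` and the function name below are our own additions for illustration (the proof only speaks of arbitrarily high probability), and the sketch assumes $\Delta \geq 2$.

```python
import math


def sample_size(max_degree, eps, delta):
    """Smallest s with 2 exp(-2 s eps^2 / (D log D)^2) <= delta, D = max_degree."""
    bound = max_degree * math.log2(max_degree)
    return math.ceil(bound ** 2 * math.log(2 / delta) / (2 * eps ** 2))


s = sample_size(4, 0.5, 0.01)
print(s)
```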



4 Minimum Entropy Coloring 

A proper coloring of a graph assigns colors to vertices such that adjacent vertices have distinct colors. 
We define the entropy of a proper coloring as the entropy of the color of a random vertex. An example
is given in Figure 1(c). The minimum entropy coloring problem is thus defined as follows:

instance: An undirected graph $G = (V, E)$
solution: A proper coloring $\phi : V \to \mathbb{N}^+$ of $G$
objective: Minimize the entropy $-\sum_i p_i \log p_i$, where $p_i := |\phi^{-1}(i)|/|V|$

Figure 3: Coding with side information.

Note that any instance of the minimum entropy coloring problem can be seen as an implicit instance 
of the minimum entropy set cover problem, in which the ground set is the set of vertices of the graph, 
and the subsets are all independent sets, described implicitly by the graph structure. 

The problem studied by the authors in [5] was actually slightly more general: the graph $G$ came
with nonnegative weights $w(v)$ on the vertices $v \in V$, summing up to 1. The weighted version of the
minimum entropy coloring problem is defined similarly as the unweighted version except now we let
$p_i := \sum_{v \in \phi^{-1}(i)} w(v)$. (The unweighted version is obtained for the uniform weights $w(v) = 1/|V|$.)

Applications. Minimum entropy colorings have found applications in the field of data compression,
and are related to several results in zero-error information theory [27]. The minimum entropy of a
coloring is called the chromatic entropy by Alon and Orlitsky [2]. It was introduced in the context
of coding with side information, a source coding scenario in which the receiver has access to
information that is correlated with the data being sent (see also [28]).

The scenario is pictured in Figure 3(a). Alice wishes to send a random variable X to Bob, who
has access to side information Y. The value Y is unknown to Alice, but allows Bob to gain some
information on X (thus X and Y are not independent). We need to find an efficient code that allows 
Alice and Bob to communicate. This code will consist of an assignment of binary codewords to the 
possible values of X. 

To give a simple example, suppose that $X \in \{a, b, c\}$, and $Y \in \{0, 1\}$. The joint probability
distribution $P(X, Y)$ of $X$ and $Y$ is such that $P(b, 1) = 0$, $P(c, 0) = 0$, and $P(x, y) > 0$ for all other pairs
$x, y$. We can notice that if Alice assigns the same codeword to the values $b$ and $c$, Bob will always be
able to lift the ambiguity, and tell that $X = b$ if his side information $Y$ has value 0, and $X = c$ if $Y = 1$.
More generally, from the joint probability distribution of $X$ and $Y$ (which is supposed to be known to
both parties), we can infer a confusability graph with vertex set the domain of $X$ (here $\{a, b, c\}$) and
an edge between two values $x, x'$ whenever there exists $y$ such that $P(x, y) > 0$ and $P(x', y) > 0$. Any
coloring of the confusability graph will yield a suitable code for Alice and Bob. The rate of this code
is exactly the entropy of the coloring, taking into account the marginal probability distribution $P(X)$.
Note that the confusability graph does not depend on the exact values of the probability, but only on
whether they are equal to 0 or not, which is typical of the zero-error setting [27].
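The construction of the confusability graph from the support of the joint distribution can be sketched as follows. The function name is ours, and the numerical probabilities in the example are made up; only their zero pattern matches the example above.

```python
from itertools import combinations


def confusability_graph(joint):
    """Edges {x, x'} such that some y has P(x, y) > 0 and P(x', y) > 0.

    `joint` maps pairs (x, y) to probabilities; missing pairs count as 0.
    """
    xs = sorted({x for x, _ in joint})
    ys = {y for _, y in joint}
    return {
        (x1, x2)
        for x1, x2 in combinations(xs, 2)
        if any(joint.get((x1, y), 0) > 0 and joint.get((x2, y), 0) > 0 for y in ys)
    }


# Illustrative values with P(b, 1) = P(c, 0) = 0.
joint = {
    ("a", 0): 0.2, ("a", 1): 0.2,
    ("b", 0): 0.3, ("b", 1): 0.0,
    ("c", 0): 0.0, ("c", 1): 0.3,
}
print(confusability_graph(joint))
```

Here $b$ and $c$ are not adjacent, so they may share a color, that is, a codeword.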

Minimum entropy colorings are instrumental in several other coding schemes introduced by Doshi
et al. [13, 11, 12] for functional data compression. They were also proposed for the encoding of segmented
images [1]. Another application is described in Section 5.

Results. Unsurprisingly, the minimum entropy coloring problem is hard to solve even on restricted
instances, and hard to approximate in general. We proved the following two results [5].

Theorem 4. Finding a minimum entropy coloring of a weighted interval graph is strongly NP-hard. 

The following hardness result is quite strong because the trivial coloring assigning a different color 
to each vertex has an entropy of at most log n. Actually, the approximation status of the problem is very 
much comparable to that of the maximum independent set problem [24]. (This is not a coincidence,
see for instance the discussion below and in particular Corollary 1.)

Theorem 5. For any positive real $\varepsilon$, it is NP-hard to approximate the minimum entropy coloring problem
within an additive error of $(1 - \varepsilon) \log n$.

A positive result was given by Gijswijt, Jost, and Queyranne:

Theorem 6 ([19]). The minimum entropy coloring problem can be solved in polynomial time on weighted
co-interval graphs.

Hardness for Unweighted Chordal Graphs. The reduction given in [5] to prove Theorem 4 uses in
a crucial way the weights on the vertices. It is therefore natural to ask whether the problem is also 
NP-hard on unweighted interval graphs. While we do not know the answer to this question, we show 
here that, in its unweighted version, the minimum entropy coloring problem is NP-hard on chordal 
graphs (which contain interval graphs). Our proof is a variant of the previous reduction; the main 
ingredient is a gadget that allows us to (roughly) simulate the weights. 

Theorem 7. The minimum entropy coloring problem is NP-hard on unweighted chordal graphs. 

Before proving Theorem 7, we introduce a few definitions and lemmas. Suppose $q = (q_i)$ and
$r = (r_i)$ are two probability distributions over $\mathbb{N}^+$ with finite support. If $\sum_{i=1}^{t} q_i \leq \sum_{i=1}^{t} r_i$ holds for
all $t$, we say that $q$ is dominated by $r$. The following lemma is a standard consequence of the strict
concavity of the function $x \mapsto -x \log x$ (see [17, 23] for different proofs).

Lemma 1. Let $q = (q_i)$ and $r = (r_i)$ be two probability distributions over $\mathbb{N}^+$ with finite support. Assume
that $q$ is nonincreasing, that is, $q_i \geq q_{i+1}$ for $i \geq 1$. If $q$ is dominated by $r$, then the entropy of $q$ is at least
that of $r$, with equality if and only if $q = r$.
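The dominance relation compares prefix sums, which is easy to check mechanically. The sketch below (names and example distributions ours) tests dominance and illustrates the conclusion of Lemma 1 on one pair of distributions.

```python
import math
from itertools import accumulate


def entropy(probabilities):
    """Shannon entropy in bits, with 0 log 0 := 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def is_dominated(q, r, tol=1e-12):
    """True if every prefix sum of q is at most the matching prefix sum of r."""
    length = max(len(q), len(r))
    q = list(q) + [0.0] * (length - len(q))
    r = list(r) + [0.0] * (length - len(r))
    return all(a <= b + tol for a, b in zip(accumulate(q), accumulate(r)))


q = [0.4, 0.3, 0.3]  # nonincreasing
r = [0.5, 0.4, 0.1]
print(is_dominated(q, r), entropy(q), entropy(r))
```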

A coloring $\phi$ of a graph $G$ realizes a probability distribution $q$ if $q$ is the probability distribution
induced by $\phi$, that is, if $q_i = |\phi^{-1}(i)|/n$ for all $i$ (where $n := |G|$).

Let $J_k$ be the intersection graph of the set $\{I_i^j \mid 1 \leq j \leq i \leq k\}$ of intervals, where $I_i^j$ denotes the
open interval $((j-1)/i, j/i)$ (see Figure 4 for a representation of $J_5$). We call the independent set
$\{I_i^j \mid 1 \leq j \leq i\}$ the $i$-th row of $J_k$.

Lemma 2. The probability distribution $(k/|J_k|, (k-1)/|J_k|, \ldots, 1/|J_k|, 0, \ldots)$ dominates every
probability distribution realizable by a coloring of $J_k$. Moreover, every coloring realizing this distribution colors
each row with a single color.
Figure 4: An interval representation of $J_5$.

Proof. For $v \in V(J_k)$, denote by $|v|$ the length of the interval corresponding to $v$. Fix $i \in \{1, \ldots, k\}$
and consider any subpartition $P$ of $V(J_k)$ into $i$ nonempty parts $P_1, \ldots, P_i$ satisfying

\[ \sum_{v \in P_j} |v| \leq 1 \qquad (7) \]

for every $1 \leq j \leq i$. By the definition of $J_k$ it follows

\[ |P_1| + \cdots + |P_i| \leq k + (k - 1) + \cdots + (k - i + 1), \qquad (8) \]

with equality if and only if the parts of $P$ correspond to the rows numbered from $k - i + 1$ to $k$.

Now, consider any coloring $\phi$ of $J_k$ and denote by $P_1, \ldots, P_\ell$ its color classes. Since each color class
of $\phi$ is an independent set, $P_j$ must satisfy (7) for every $j$. The lemma follows then by using (8) for
every $i \in \{1, \ldots, \ell\}$. □

Proof of Theorem 7. We reduce from the NP-complete problem of deciding if a circular arc graph $G$ is
$k$-colorable [18]. Given a circular arc graph $G$, there exists a polynomial time algorithm to construct
a circular representation of it [36]. A key idea of our proof is to start with a circular arc graph and cut
it open somewhere to obtain an interval graph (see [30] for another application of this technique to
minimum sum coloring).

Let $y$ be an arbitrary point on the circle that is not the endpoint of any arc in the representation of
$G$. Denote by $k'$ the number of arcs in which $y$ is included. If $k' > k$, then $G$ is not $k$-colorable. On
the other hand, if $k' < k$, we add to the representation $k - k'$ sufficiently small arcs that only intersect
arcs including $y$. This cannot increase the chromatic number of $G$ beyond $k$. Thus we assume that $y$ is
contained in exactly $k$ arcs.

We denote by $a_1, \ldots, a_k$ the arcs containing $y$. Splitting each arc $a_i$ into two parts $\ell_i$ and $r_i$ at point
$y$ yields an interval representation of an interval graph $G'$. The original graph $G$ is $k$-colorable if and
only if there exists a $k$-coloring of $G'$ in which $\ell_i$ and $r_i$ receive the same color for $1 \leq i \leq k$.

Up to this point, the reduction is the same as for weighted interval graphs [5]. The latter proceeds
by adding weights on the vertices of the interval graph $G'$. Here, we will instead subdivide each interval
$\ell_i$ and $r_i$. We first describe the transformation for the intervals $r_i$. Consider an interval representation
of $G'$ where each interval $r_i$ is of the form $r_i = (y_i, -1) \cup [-1, 1)$ for some real $y_i < -1$. We split each
$r_i$ into $i + 1$ intervals $r_i^0, r_i^1, \ldots, r_i^i$ with

\[
r_i^j := \begin{cases}
(y_i, -i/k) & \text{if } j = 0; \\
(-i/k, 1/i) & \text{if } j = 1; \\
((j-1)/i, j/i) & \text{otherwise.}
\end{cases}
\]

(See Figure 5 for an illustration.) Notice that the subgraph induced by $R := \{r_i^j \mid 1 \leq j \leq i \leq k\}$ is
isomorphic to $J_k$. A symmetric modification is made on the intervals $\ell_i$; the new intervals are denoted
$\ell_i^0, \ell_i^1, \ldots, \ell_i^i$. We also let $L := \{\ell_i^j \mid 1 \leq j \leq i \leq k\}$, and denote by $J$ the graph obtained after the
transformation on the intervals $r_i$ and $\ell_i$ is completed.
Figure 5: Splitting of the $r_i$'s for $k = 5$.

Now, we hang on each vertex $v \in V(J) - (L \cup R)$ a clique of cardinality $k$. Thus $v$ is one of the
vertices of the clique and the $k - 1$ other vertices are new vertices that are added to $J$. The resulting
graph $J'$ is not necessarily an interval graph, but it is still chordal.

Let $q^* = (q_i^*)$ be the probability distribution over $\mathbb{N}^+$ defined as

\[
q_i^* := \begin{cases}
(n + 2(k - i + 1))/|J'| & \text{if } i \in \{1, \ldots, k\}, \\
0 & \text{otherwise,}
\end{cases}
\]

where $n := |G'|$.

Claim 1. $q^*$ dominates every probability distribution realizable by a coloring of $J'$.

Proof. Let $\phi$ be any coloring of $J'$ and denote by $q = (q_i)$ the probability distribution realized by $\phi$.
Let $U := V(J') - (L \cup R)$ and write $q$ as $q = q^R + q^L + q^U$, where $q^R$ ($q^L$, $q^U$) denotes the sequence $q$
where only the contribution of vertices in $R$ ($L$, $U$ respectively) is taken into account.
By Lemma 2, the sequence $q^R$ is dominated by

\[ q^{*,J_k} := (k/|J'|, (k-1)/|J'|, \ldots, 1/|J'|, 0, \ldots). \]

(Although the dominance relation was originally defined for probability distributions only, it naturally
extends to a pair $q, r$ of sequences of nonnegative real numbers with $\sum_{i \geq 1} q_i = \sum_{i \geq 1} r_i$.) The same
holds for $q^L$. Moreover, $q^U$ is dominated by $q^{*,U}$, where

\[
q_i^{*,U} := \begin{cases}
n/|J'| & \text{if } i \in \{1, \ldots, k\}, \\
0 & \text{otherwise.}
\end{cases}
\]

This is easily seen using the fact that $U$ can be partitioned into $n$ cliques of cardinality $k$. Since
$q^* = 2q^{*,J_k} + q^{*,U}$, we deduce that $q^*$ dominates $q$. Hence, Claim 1 holds. □

Claim 2. $G$ is $k$-colorable if and only if there exists a coloring of $J'$ realizing $q^*$.

Proof. By the definition of $J'$, a $k$-coloring of $G$ can be readily extended to a $k$-coloring of $J'$ realizing
$q^*$. In order to prove the other direction of the claim, assume that $\phi$ is a coloring of $J'$ realizing $q^*$.
Using the second part of Lemma 2, we know that $\phi$ colors the vertices of $R$ rowwise, that is, it assigns
color $k - i + 1$ to all vertices of the $i$-th row $\{r_i^j \mid 1 \leq j \leq i\}$. Also, as $\phi$ uses exactly $k$ colors and $r_i^0$ is
adjacent to $r_{i'}^1$, for $1 \leq i < i' \leq k$, the vertex $r_i^0$ is also colored with color $k - i + 1$. Since the same
observations apply also on $L$, from $\phi$ we easily derive a $k$-coloring of $G$. Claim 2 follows. □

By combining Claims 1 and 2 with Lemma 1, we deduce the following: If $G$ is $k$-colorable, then
every minimum entropy coloring of $J'$ realizes $q^*$. Furthermore, if $G$ is not $k$-colorable, then no
coloring of $J'$ realizes $q^*$. Therefore, by computing a minimum entropy coloring of the chordal graph
$J'$, we could decide if the circular arc graph $G$ is $k$-colorable. Theorem 7 follows. □
We remark that, as pointed out by a referee, the graph J' in the above reduction is actually strongly
chordal (every cycle of even length at least 6 has a chord splitting the cycle in two even cycles). Thus
the minimum entropy coloring problem remains NP-hard on (unweighted) strongly chordal graphs. 
We note that interval graphs are a proper subclass of strongly chordal graphs, which in turn are a
proper subclass of chordal graphs (see [4]).



Greedy Coloring. The greedy algorithm can be used to find approximate minimum entropy col- 
orings. In the context of coloring problems, the greedy algorithm involves iteratively removing a 
maximum independent set in the graph, assigning a new color to each removed set. This algorithm is 
polynomial for families of graphs in which a maximum (size or weight) independent set can be found 
in polynomial time, such as perfect graphs. We therefore have the following result. 

Corollary 1. The minimum entropy coloring problem can be approximated within an additive error of
$\log e$ ($\approx 1.4427$) bits when restricted to perfect graphs.
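The greedy coloring procedure can be sketched as follows. For simplicity, this illustration finds a maximum independent set by exhaustive search, which takes exponential time and is only suitable for tiny graphs; it is not the polynomial-time routine available for perfect graphs. Function names and examples are ours, and vertex labels are assumed to be comparable.

```python
import math
from itertools import combinations


def maximum_independent_set(vertices, edges):
    """Exhaustive search for a largest independent set (tiny graphs only)."""
    ordered = sorted(vertices)
    for size in range(len(ordered), 0, -1):
        for candidate in combinations(ordered, size):
            chosen = set(candidate)
            if not any(u in chosen and v in chosen for u, v in edges):
                return chosen
    return set()


def greedy_coloring(vertices, edges):
    """Color classes obtained by repeatedly removing a maximum independent set."""
    remaining, classes = set(vertices), []
    while remaining:
        live = [(u, v) for u, v in edges if u in remaining and v in remaining]
        color_class = maximum_independent_set(remaining, live)
        classes.append(color_class)
        remaining -= color_class
    return classes


def coloring_entropy(classes):
    """Entropy (in bits) of the color class size distribution."""
    n = sum(len(c) for c in classes)
    return -sum(len(c) / n * math.log2(len(c) / n) for c in classes)


path = greedy_coloring([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
star = greedy_coloring([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
print(path, star, coloring_entropy(star))
```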

We now show an example of an "approximate greedy" algorithm yielding a bounded approximation
error on bounded-degree graphs.

Lemma 3. The algorithm that iteratively removes a $\beta$-approximate maximum independent set yields an
approximation of the minimum entropy of a coloring within an additive error of $\log \beta + \log e$ bits.

Proof. The entropy of a coloring $\phi$ can be rewritten as (letting $n := |V|$):

\[ -\sum_{i} \frac{|\phi^{-1}(i)|}{n} \log \frac{|\phi^{-1}(i)|}{n} = -\frac{1}{n} \sum_{v \in V} \log \frac{|\phi^{-1}(\phi(v))|}{n}. \qquad (9) \]

Thus minimizing the entropy of a coloring $\phi : V \to \mathbb{N}^+$ of $G$ is equivalent to maximizing the following
product of the color class sizes over all vertices:

\[ \Pi_{\phi} := \prod_{v \in V} |\phi^{-1}(\phi(v))|. \]

Consider a color class $S$ in an optimal coloring $\phi_{OPT}$ and the order in which the vertices of this subset
are colored by the approximate greedy algorithm (breaking ties arbitrarily). Let us denote by $\phi_g$ the
coloring obtained with the approximate greedy algorithm. The first vertex in this order is assigned by
$\phi_g$ to a color class that has size at least $|S|/\beta$ (since at that stage, we know there exists an independent
set of size at least $|S|$). The next vertex is assigned to a color class of size at least $(|S| - 1)/\beta$, and so
on. Hence the product of the color class sizes restricted to the elements of $S$ is at least $|S|!/\beta^{|S|}$. In the
optimal solution, however, this product is $|S|^{|S|}$ by definition. Hence, denoting by $S_i$ the color classes
of an optimal solution, the approximate greedy algorithm yields at least the following product:

\[ \Pi_{\phi_g} \geq \prod_{i} \frac{|S_i|!}{\beta^{|S_i|}} \geq \prod_{i} \frac{|S_i|^{|S_i|}}{(e \cdot \beta)^{|S_i|}} = \frac{\Pi_{\phi_{OPT}}}{(e \cdot \beta)^{n}}. \qquad (10) \]

Converting this back to entropies using equation (9) yields the claimed result. □

Theorem 8. Minimum entropy coloring is approximable in polynomial time within an additive error of
$\log(\Delta + 2) - 0.1423$ bits on graphs with maximum degree $\Delta$.

Proof. It is known that for graphs with maximum degree at most $\Delta$, a $\rho$-approximate maximum
independent set can be computed in polynomial time for $\rho := (\Delta + 2)/3$ (see Halldórsson and
Radhakrishnan [21]). We have

\[ \log \rho + \log e = \log(\Delta + 2) + \log e - \log 3 < \log(\Delta + 2) - 0.1423. \]

Hence, the claim follows from Lemma 3. □






Coloring Interval Graphs. We give the following simple polynomial-time algorithm for approximating
the minimum entropy coloring problem on (unweighted) interval graphs. This algorithm is
essentially the online coloring algorithm proposed by Kierstead and Trotter [25], but in which the
intervals are given in a specific order, and the intervals in $S_i$ are 2-colored offline.

1. sort the intervals in increasing order of their right endpoints; let
$(v_1, v_2, \ldots, v_n)$ be this ordering

2. for $j \leftarrow 1$ to $n$ do

• insert $v_j$ in $S_i$, where $i$ is the smallest index such that $S_1 \cup S_2 \cup \ldots \cup S_i \cup \{v_j\}$
does not contain an $(i + 1)$-clique

3. color the intervals in $S_1$ with color 1

4. let $k$ be the number of nonempty sets $S_i$

5. for $i \leftarrow 2$ to $k$ do

• color the intervals in $S_i$ with colors $2i - 2$ and $2i - 1$

This algorithm is also similar to the BETTER-MCA algorithm introduced by Pemmaraju, Raman, and
Varadarajan for the max-coloring problem [33], but instead of sorting the intervals in order of their
weights, we sort them in order of their right endpoints. It achieves the same goal as the algorithm of
Nicoloso, Sarrafzadeh, and Song [31] for minimum sum coloring.
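The steps above can be sketched as follows, under assumptions of our own: intervals are distinct pairs `(left, right)` treated as closed intervals, the clique test recomputes the maximum overlap from scratch (simple rather than efficient), and each set $S_i$ with $i \geq 2$ is 2-colored by a left-to-right sweep, which relies on these sets inducing bipartite graphs.

```python
def clique_number(intervals):
    """Largest number of closed intervals (left, right) sharing a point."""
    return max(
        sum(1 for l2, r2 in intervals if l2 <= left <= r2)
        for left, _ in intervals
    )


def layered_coloring(intervals):
    """Return (layers, coloring); layers[i - 1] plays the role of S_i."""
    layers = []
    for iv in sorted(intervals, key=lambda interval: interval[1]):  # step 1
        i = 1
        while True:  # step 2: smallest i avoiding an (i + 1)-clique
            union = [x for layer in layers[:i] for x in layer] + [iv]
            if clique_number(union) <= i:
                break
            i += 1
        while len(layers) < i:
            layers.append([])
        layers[i - 1].append(iv)

    coloring = {iv: 1 for iv in layers[0]} if layers else {}  # step 3
    for i in range(2, len(layers) + 1):  # step 5
        for iv in sorted(layers[i - 1]):  # sweep by left endpoint
            used = {
                coloring[o]
                for o in layers[i - 1]
                if o in coloring and o[0] <= iv[1] and iv[0] <= o[1]
            }
            coloring[iv] = next(c for c in (2 * i - 2, 2 * i - 1) if c not in used)
    return layers, coloring


layers, coloring = layered_coloring([(0, 2), (1, 6), (7, 8), (5, 9)])
print(layers, coloring)
```

In this example the second set contains two intersecting intervals, so it needs both of its colors: the resulting coloring has class sizes 2, 1, 1, while the sets $S_1, S_2$ have sizes 2, 2.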

Lemma 4. At the end of the algorithm, the graph induced by the vertices in $S_1 \cup S_2 \cup \ldots \cup S_i$ is a maximum
$i$-colorable subgraph of $G$.

Proof. If we consider the construction of the set $S_1 \cup S_2 \cup \ldots \cup S_i$, it matches exactly with an execution
of the algorithm of Yannakakis and Gavril [37] for constructing a maximum $i$-colorable subgraph. □

Kierstead and Trotter [25] proved the following lemma, which shows that step 5 of the algorithm is
always feasible.

Lemma 5 ([25]). The graphs induced by the sets $S_i$ are bipartite.

Let $H'$ be the entropy of the probability distribution $\{|S_i|/n\}$. The proof of the following relies on
Lemma 4.

Lemma 6. $H'$ is a lower bound on the minimum entropy of a coloring of the interval graph.

Proof. Let us consider a coloring with color classes $Q_i$. We can assume without loss of generality that
the classes $Q_i$ are labeled in order of decreasing sizes, that is, $|Q_{i+1}| \leq |Q_i|$. For any $t \leq k$:

\[ \sum_{i=1}^{t} |Q_i| \leq \sum_{i=1}^{t} |S_i|, \]

since from Lemma 4 the set $S_1 \cup S_2 \cup \ldots \cup S_t$ induces a maximum $t$-colorable subgraph of $G$. It follows
from Lemma 1 that the entropy of the distribution $\{|S_i|/n\}$ is smaller than or equal to that of $\{|Q_i|/n\}$. □

We deduce the following. 

Theorem 9. On (unweighted) interval graphs, the minimum entropy coloring problem can be
approximated within an additive error of 1 bit, in polynomial time.

Proof. The entropy of the coloring produced by the algorithm is at most that of the distribution $\{|S_i|/n\}$
plus one bit, since the intervals in $S_i$ are colored with at most two colors. From Lemma 6, the former
is a lower bound on the optimum, and we get the desired approximation. □

To conclude this section, we mention that improved approximation results could be obtained on
other special classes of graphs, using for instance the methods developed by Fukunaga, Halldórsson,
and Nagamochi [17, 16].



5 Graph Entropy and Partial Order Production 

The notion of graph entropy was introduced by Körner in 1973 [26], and was initially motivated by a
source coding problem. It has since found a wide range of applications (see for instance the survey by
Simonyi [35]).

The entropy of a graph $G = (V, E)$ can be defined in several ways. We now give a purely combinatorial
(and not information-theoretic) definition, and restrict ourselves to unweighted (or, more
precisely, uniformly weighted) graphs. Let us consider a probability distribution $\{q_S\}$ on the
independent sets $S$ of $G$, and denote by $p_v$ the probability that $v$ belongs to an independent set drawn at
random from this distribution (that is, $p_v := \sum_{S \ni v} q_S$). Then the entropy of $G$ is the minimum over
all possible distributions $\{q_S\}$ of

\[ -\frac{1}{n} \sum_{v \in V} \log p_v. \qquad (11) \]

The feasible vectors $(p_v)_{v \in V}$ form the stable set polytope STAB(G) of the graph $G$, defined as the
following convex hull:

\[ \mathrm{STAB}(G) := \mathrm{conv}\{\mathbf{1}^S : S \text{ independent set of } G\}, \]

where $\mathbf{1}^S$ is the characteristic vector of the set $S \subseteq V$, assigning the value 1 to vertices in $S$, and 0 to
the others. Thus, the entropy of $G$ can be written as

\[ H(G) := \min_{p \in \mathrm{STAB}(G)} -\frac{1}{n} \sum_{v \in V} \log p_v. \]

vGV 



The relation between the graph entropy and the chromatic entropy (the minimum entropy of a
coloring) can be made clear from the following observation: if we restrict the vector $p$ in the definition
of $H(G)$ to be a convex combination of characteristic vectors of disjoint independent sets, then the
minimum of (11) is equal to the chromatic entropy (see [5] for a rigorous development). Hence,
the mathematical program defining the graph entropy is a relaxation of that defining the chromatic
entropy. It follows that the graph entropy is a lower bound on the chromatic entropy.

Recently, we showed that the greedy coloring algorithm, that iteratively colors and removes a 
maximum independent set, yields a good approximation of the graph entropy, provided the graph is 
perfect. 

Theorem 10 ([9]). Let $G$ be a perfect graph, and $g$ be the entropy of a greedy coloring of $G$. Then

\[ g \leq H(G) + \log(H(G) + 1) + O(1). \]

The proof of Theorem 10 is a dual fitting argument based on the identity $H(G) + H(\bar{G}) = \log n$,
that holds whenever $G$ is perfect (Csiszár, Körner, Lovász, Marton, and Simonyi [10] characterized the
perfect graphs as the graphs that "split entropy", for all probability distributions on the vertices). In
particular, Theorem 10 implies that the chromatic entropy of a perfect graph never exceeds its entropy
by more than $\log \log n + O(1)$. It turns out that this is essentially tight [9].
Theorem 10 is the key tool in a recent algorithm for the partial order production problem [9]. In
this problem, we want to bijectively map a set T of objects coming with an unknown total order to
a vector equipped with a partial order on its positions, such that the relations of the partial order are
satisfied by the mapped elements. The problem admits the selection, multiple selection, sorting, and
heap construction problems as special cases. Until recently, it was not known whether there existed a
polynomial-time algorithm performing this task using a near-optimal number of comparisons between
elements of T. By applying (twice) the greedy coloring algorithm and the approximation result of
Theorem 10, we could reduce, in polynomial time, the problem to a well-studied multiple selection
problem and solve it with a near-optimal number of comparisons. The reader is referred to [9] for
further details.